Since this technology is still new, significant research efforts are needed to accelerate
and simplify the design phases. Mapping is a critical phase in the NoC
design process, as a mismatch of application software components can signif- 
icantly impact the final system’s performance. Therefore, it is essential to de- 
1. INTRODUCTION


The network-on-chip (NoC) concept originated from the need to accommodate the increasing size, 
complexity, and heterogeneity of applications running on system-on-chip (SoC). As the traditional communi- 
cation bus between components failed to fulfill all the demands, the idea of establishing a network on a chip 
was conceived based on computer network principles. A critical stage in designing a NoC is the mapping of 
intellectual property (IP) cores to the architecture. This process has a significant impact on the performance of 
the system, affecting factors such as energy usage, latency, and load distribution. This process is considered 
an NP-hard problem, as more than one critical performance factor must be considered to develop the optimal 
mapping algorithm. Thus, several methods have been proposed in the literature to address this issue, often 
relying on heuristic algorithms. Differential evolution (DE) is one such algorithm that delivers good
performance with low complexity. In this paper, we aim to enhance DE’s efficacy by coordinating it with
other techniques. The remainder of this paper is organized as follows: section 2 provides an overview of NoC;
section 3 highlights some relevant studies; section 4 discusses the application mapping problem and describes
how the differential evolution algorithm is applied to it; section 5 presents the experimental outcomes; and
section 6 concludes the paper.
2. NOC GENERALITIES

The idea of NoC is derived from the networks initially developed for supercomputers, comprising a 
group of interconnected devices on a single chip that communicate through packets sent over a scalable inter- 
connection network. Compared to traditional bus architectures, NoC offers several advantages, such as energy 
efficiency, reliability, bandwidth scalability, and reusability [1]. The topology of a NoC is determined by how 
its components are interconnected to create the on-chip interconnect, which can take on various forms, such 
as 2D mesh, torus, and ring. In addition to the topology, other characteristics of NoC include communication 
mode, flow control mechanisms to prevent deadlock issues, and storage strategies. Figure 1 presents a NoC
with 9 tiles in a 2D mesh topology.


Figure 1. A 3x3 2D mesh NoC 


Designing a NoC-based system involves multiple stages. Initially, the application is decomposed into a
set of communicating tasks that can be executed concurrently. After that, each task is assigned to and
scheduled on a core selected from those available. Finally, these cores must be mapped onto the NoC to
complete the system design [2].

This paper specifically concentrates on the final stage of application mapping, which is a critical 
but still unresolved search problem. The optimal mapping solution can yield energy savings of up to 51.7% 
compared to ad hoc implementations, as demonstrated in [3]. To achieve high performance, finding the optimal 
mapping solution is essential. For instance, if there are m tasks to be mapped onto a NoC consisting of n
cores, where $m \le n$, the number of potential solutions can reach up to $n!/(n - m)!$. Application mapping
is a combinatorial optimization problem that is classified as NP-hard. Because an exhaustive search quickly
becomes impractical, heuristic algorithms are typically used to find a near-optimal solution.
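
To give a sense of scale, the count n!/(n - m)! can be evaluated directly. The short sketch below is only an illustration; the function name `mapping_count` and the mesh sizes are chosen here for the example and do not come from the experiments.

```python
import math


def mapping_count(n_tiles: int, m_tasks: int) -> int:
    """Number of ways to place m distinct tasks on n tiles, one task per tile."""
    if m_tasks > n_tiles:
        raise ValueError("the formula assumes m <= n")
    return math.perm(n_tiles, m_tasks)  # n! / (n - m)!


print(mapping_count(9, 9))    # 3x3 mesh, 9 tasks: 362880
print(mapping_count(16, 12))  # 4x4 mesh, 12 tasks: 871782912000
print(mapping_count(16, 16))  # 4x4 mesh, 16 tasks: 20922789888000
```

Even on a 4x4 mesh, the search space is already far too large to enumerate for every candidate design, which motivates the heuristic search.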


3. RELATED WORK 

In NoC mapping, various approaches have been proposed, with particular attention given to the two- 
dimensional mesh topology. In this study, we review the most cited mapping techniques that consider mono- 
objective mapping. These techniques can be divided into two classes: meta-heuristic algorithms and heuristic 
approaches. 

Meta-heuristic algorithms are widely used to solve NP-hard problems and strive to achieve a solution 
that is close to optimal. Examples of such algorithms include genetic algorithms, ant colony optimization, and 
particle swarm optimization. For example, GBMAP and CGMAP use genetic algorithms, and in [6], 
an ACO-based algorithm is proposed to minimize the bandwidth requirement. In reference [7], a technique 
for optimizing performance using deterministic initial solutions has been proposed. Specifically, a discrete 
multiple particle swarm optimization (PSO) based mapping technique was utilized, where the behavior of 
swarm intelligence serves as a meta-heuristic. 

Unlike meta-heuristic algorithms, heuristic approaches are tailored to a specific problem and rely 
on specific cues to guide the search process. These cues are determined by the nature of the problem being 
addressed. For instance, NMAP repeatedly selects an application core and maps it to a tile, while
BMAP [9] maps the cores according to their traffic loads. The CastNet algorithm described in reference
[10] generates multiple mapping solutions by using multiple tiles as initial tiles. The algorithm uses the
symmetric characteristics of a mesh, and the number of available neighboring tiles of each tile is considered
when determining the best solution for each core.

Chmap determines the priority of each core by analyzing the communication needs and the data 
in the spanning tree. Next, the algorithm maps the chosen cores to the suitable tiles based on their priorities by 
establishing the mapping order of the cores. The ONYX algorithm utilizes four moves to assign a core to 
the tiles on a lozenge-shaped path and achieves a lower communication cost than previous mapping techniques. 
The Spiral approach, described in [13], maps the task with the highest priority at the center first,
followed by the remaining tasks along a spiral path. Exact methods are also used in mapping but require long
computation times, such as those based on integer linear programming (ILP) [14] or the branch
and bound search method [15].


4. APPLICATION MAPPING PROBLEM 

Our research focuses on the mapping step, which takes two key inputs: the NoC architecture and the
application that is supposed to run on the NoC. In our case, we used a NoC with a 2D mesh topology, as shown in
Figure 2. An application is made up of multiple concurrent tasks, each of which needs to be placed onto a core
of the NoC. The goal of the mapping process is to find the placement with minimum communication cost, which
is the factor that we consider in this paper. The resulting mapping solution is presented in the form of
a table, where each task is represented by an index i, and the content of entry i is the number of the tile
assigned to that task during mapping.




Figure 2. Application mapping problem 


4.1. NoC Model 

The problem is defined using three distinct definitions. 

Definition 1: The core graph is a directed graph $G(V, E)$. In this graph, each
vertex $v_i$ represents a core, while a directed edge $e_{i,j}$ indicates the link between core $v_i$ and core $v_j$. The
weight of edge $e_{i,j}$ reflects the extent of communication between the two corresponding vertices.

Definition 2: A graph $A(T, L)$ represents the NoC architecture, where every vertex $t_i$ denotes a tile
within the NoC architecture. Meanwhile, each edge $l_{i,j}$ represents a physical link originating from $t_i$ and
terminating at $t_j$.

Definition 3: The mapping function that associates each vertex $v_i$ in the core graph with a vertex $t_j$ in
the NoC architecture is defined as follows.


$$map: V \rightarrow T, \quad map(v_i) = t_j, \quad \forall v_i \in V,\ \exists t_j \in T$$


In the core graph, each edge is considered as a flow of a single commodity, denoted as $c^k$. Its value
represents the required bandwidth and is expressed as $vl(c^k)$. The set of all commodities is represented by


$$C = \{ c^k : vl(c^k) = comm_{i,j},\ k = 1, \dots, |E|,\ comm_{i,j} \in E \}$$


with


$$source(c^k) = map(v_i) \quad \text{and} \quad destination(c^k) = map(v_j)$$


To determine the quantity of communication between $v_i$ and $v_j$, it is necessary to count the number of hops
between $tile_i$ and $tile_j$. On a 2D mesh NoC, the X-Y routing algorithm is employed, and the number of hops
can be determined using (1).


$$Hops(tile_i, tile_j) = |X_i - X_j| + |Y_i - Y_j| \tag{1}$$


In a 2D mesh NoC, the tiles i and j are represented by $(X_i, Y_i)$ and $(X_j, Y_j)$, respectively.


4.2. Objective function 


This paper aims to minimize the communication cost when mapping tasks onto the NoC. The primary
strategy involves reducing the number of hops required for each communication within the application.
The communication cost is computed using (2).


$$commcost = \sum_{k=1}^{|E|} vl(c^k) \times nbhops(src(c^k), dist(c^k)) \tag{2}$$


The source of a communication $c^k$ is denoted by $src(c^k)$, while its destination is represented by $dist(c^k)$;
$nbhops$ is the number of hops between the two corresponding tiles, as given by (1).
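
As an illustration of (1) and (2), the following minimal sketch evaluates the communication cost of a mapping. It assumes that tiles are numbered row by row on the mesh and that a mapping is stored as a list whose entry i is the tile assigned to core i; the four-core graph and its volumes are hypothetical values chosen for the example, not one of the benchmarks.

```python
def tile_xy(tile, width):
    """(X, Y) coordinates of a tile, assuming row-by-row tile numbering."""
    return tile % width, tile // width


def hops(tile_a, tile_b, width):
    """Hop count between two tiles under X-Y routing, as in (1)."""
    xa, ya = tile_xy(tile_a, width)
    xb, yb = tile_xy(tile_b, width)
    return abs(xa - xb) + abs(ya - yb)


def comm_cost(edges, mapping, width):
    """Communication cost (2): sum of volume times hop count over all edges."""
    return sum(vol * hops(mapping[src], mapping[dst], width)
               for src, dst, vol in edges)


# Hypothetical core graph: (source core, destination core, volume).
edges = [(0, 1, 70), (1, 2, 362), (2, 3, 362), (0, 3, 16)]

# Two candidate mappings on a 2x2 mesh (entry i = tile of core i).
print(comm_cost(edges, [0, 1, 2, 3], width=2))  # 1188
print(comm_cost(edges, [0, 1, 3, 2], width=2))  # 810
```

In the second mapping every communicating pair sits on adjacent tiles, so each volume is counted once; in the first, two edges span two hops and their volumes are counted twice.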


4.3. Differential evolution (DE) 


In 1997, Price and Storn proposed DE as an enhanced version of genetic algorithms. Like genetic
algorithms, DE relies on an initial population and employs the same operators: crossover, mutation, and
selection. However, DE applies these operators in a different order, as illustrated in Figure 3.

The primary distinction between genetic algorithms and DE lies in their respective approaches to 
building better solutions. Genetic algorithms utilize crossover, while DE relies on the operation of mutation. 
In DE, the mutation operation is the primary mechanism for searching and exploring potential regions in the 
search space, and the selection operator directs convergence toward those regions. 

According to [16], the DE/x/y/z notation is frequently employed to describe a DE strategy. The 
symbol x denotes the vector that will undergo mutation, which may be either a randomly selected population 
vector (rand) or the vector with the lowest cost in the current population (best). y specifies the number of 
differential vectors employed to perturb the target vector, and z denotes the crossover scheme, which could be 
either exponential or binomial. 
Figure 3. Genetic algorithm vs differential evolution


4.4. Differential evolution algorithm steps 

DE is a global optimization algorithm that operates at the population level [16]. At the outset, it creates
a population of NP individuals, each of dimension D, where each individual encodes a potential solution to the
optimization problem being addressed, denoted by $X_{i,G} = (x^1_{i,G}, \dots, x^D_{i,G})$ with $i = 1, \dots, NP$
and G indicating the generation to which the population belongs. The individuals of the initial population are
randomly distributed across the search space. Subsequently, the algorithm follows a set of primary steps [17].


4.4.1. Mutation operation


In the DE algorithm, during generation G, the population is perturbed by the mutation operator,
which associates each individual $X_{i,G}$ with a corresponding mutant vector $V_{i,G}$. The mutant vector can be
generated using different strategies, among which the most frequently employed ones are listed in [18].


- DE/rand/1:
$$V_{i,G} = X_{r_1,G} + F \cdot (X_{r_2,G} - X_{r_3,G}) \tag{3}$$
- DE/best/1:
$$V_{i,G} = X_{best,G} + F \cdot (X_{r_1,G} - X_{r_2,G}) \tag{4}$$
- DE/best/2:
$$V_{i,G} = X_{best,G} + F \cdot (X_{r_1,G} - X_{r_2,G}) + F \cdot (X_{r_3,G} - X_{r_4,G}) \tag{5}$$
- DE/rand/2:
$$V_{i,G} = X_{r_1,G} + F \cdot (X_{r_2,G} - X_{r_3,G}) + F \cdot (X_{r_4,G} - X_{r_5,G}) \tag{6}$$
- DE/current-to-best/2:
$$V_{i,G} = X_{i,G} + F \cdot (X_{best,G} - X_{r_1,G}) + F \cdot (X_{r_2,G} - X_{r_3,G}) \tag{7}$$
- DE/current-to-rand/2:
$$V_{i,G} = X_{i,G} + F \cdot (X_{r_1,G} - X_{r_2,G}) + F \cdot (X_{r_3,G} - X_{r_4,G}) \tag{8}$$


The variable $V_{i,G}$ represents the mutant vector that is being created. The indices $r_1$, $r_2$, $r_3$, $r_4$, and $r_5$ are
mutually distinct integers randomly generated within the range [1, NP] and different from the index i. The variable $X_{best,G}$
corresponds to the best individual in the population at generation G. The scale factor F is a real constant that
is typically selected in the range [0, 1] and determines the degree of amplification of the difference variation.
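
The strategies in (3) to (6) can be sketched with NumPy as follows. This is a minimal illustration rather than the implementation used in the experiments: the function name `mutate`, its argument names, and the toy sphere cost are assumptions made for the example, and the population is assumed to hold at least six individuals so that five distinct indices different from i can be drawn.

```python
import numpy as np


def mutate(pop, i, best_idx, F, strategy, rng):
    """Return the mutant vector V_i for target index i (0-based indices)."""
    # Draw r1..r5: mutually distinct and different from i.
    candidates = [r for r in range(len(pop)) if r != i]
    r1, r2, r3, r4, r5 = rng.choice(candidates, size=5, replace=False)
    best = pop[best_idx]
    if strategy == "rand/1":   # equation (3)
        return pop[r1] + F * (pop[r2] - pop[r3])
    if strategy == "best/1":   # equation (4)
        return best + F * (pop[r1] - pop[r2])
    if strategy == "best/2":   # equation (5)
        return best + F * (pop[r1] - pop[r2]) + F * (pop[r3] - pop[r4])
    if strategy == "rand/2":   # equation (6)
        return pop[r1] + F * (pop[r2] - pop[r3]) + F * (pop[r4] - pop[r5])
    raise ValueError(f"unknown strategy: {strategy}")


rng = np.random.default_rng(0)
pop = rng.random((10, 4))          # NP = 10 individuals, D = 4
cost = (pop ** 2).sum(axis=1)      # toy cost, for illustration only
v = mutate(pop, i=0, best_idx=int(cost.argmin()), F=0.85,
           strategy="best/1", rng=rng)
print(v)
```

The best-based strategies pull every mutant toward the current best individual, whereas the rand-based ones start from a random base vector and therefore explore more widely.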


4.4.2. Crossover operation 

Following the mutation phase, the diversity of the population is increased through the application of
the crossover operation in DE. This operation utilizes the mutant vector $V_{i,G}$ created during the mutation phase
to exchange its components with the target vector $X_{i,G}$, thereby producing the test vector $U_{i,G}$. The crossover
operation can be expressed using the formulation presented in [19].


$$U^j_{i,G} = \begin{cases} V^j_{i,G} & \text{if } rand_j[0, 1] < CR \text{ or } j = j_{rand} \\ X^j_{i,G} & \text{otherwise} \end{cases} \tag{9}$$


In the aforementioned expression, j is an integer that varies between 1 and D. The variable $rand_j$ corresponds
to the jth evaluation of a uniform random number generator that produces values within the interval [0, 1], as
described in [20]. The crossover rate, denoted by CR, is a constant specified by the user and takes values
within the range [0, 1]. Additionally, $j_{rand}$ is a random integer selected from the range [1, D], as stated in [20].
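
A possible NumPy rendering of the crossover in (9) is shown below. It is a sketch under stated assumptions: the function name `binomial_crossover` is illustrative, indices are 0-based, and the random generator draws from [0, 1).

```python
import numpy as np


def binomial_crossover(target, mutant, CR, rng):
    """Build the test vector U from target X and mutant V, as in (9)."""
    D = target.size
    take_mutant = rng.random(D) < CR      # rand_j < CR, one draw per component
    take_mutant[rng.integers(D)] = True   # component j_rand always comes from V
    return np.where(take_mutant, mutant, target)


rng = np.random.default_rng(1)
x = np.zeros(6)
v = np.ones(6)
print(binomial_crossover(x, v, CR=0.5, rng=rng))
```

Forcing the component at `j_rand` to come from the mutant guarantees that the test vector differs from the target in at least one position, even when CR is very small.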


4.4.3. Selection operation 

Once the test vector $U_{i,G}$ has been created through the crossover operation, the selection operation is
carried out to maintain the population size for the next generation. To accomplish this, the objective function
is evaluated for both the target vector $X_{i,G}$ and the test vector $U_{i,G}$. If the fitness value of the
test vector is better than that of the target vector, the test vector replaces the target vector; otherwise,
the target vector remains in the population. The selection procedure is
mathematically represented as shown in [21].


$$X_{i,G+1} = \begin{cases} U_{i,G} & \text{if } f(U_{i,G}) < f(X_{i,G}) \\ X_{i,G} & \text{otherwise} \end{cases} \tag{10}$$


The three steps (mutation, crossover, and selection) are repeated for each generation until a termination
criterion is met.

The DE algorithm is characterized by the interaction between the different individuals. The mechanism
responsible for generating new potential solutions is the imitation of the global behavior of the neighborhood.
Algorithm 1 presents a classic version of the DE algorithm.

The parameters used in Algorithm 1 are:

- D: the dimension of the problem.
- NP: the number of individuals.
- F: the value of the scale factor.
- CR: the crossover rate.
- Maxit: the maximum number of iterations.
- The choice of mutation strategy.


Algorithm 1 Differential evolution algorithm


Initialize the individuals of the population
Evaluate all individuals
Initialize the best solution (Best)
iteration := 0
while iteration < max_iteration do
    for each individual do
        (a) Generate a mutant vector using the mutation strategy
        (b) Generate the test vector using the crossover operator
        (c) Evaluate the test vector
        (d) If the test vector is better than the individual, replace it
    end for
    Update the best solution
    iteration := iteration + 1
end while
Return the best solution
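
The loop of Algorithm 1 can be sketched in Python as below. The text does not specify how a real-valued DE vector is turned into a tile assignment, so this sketch assumes a random-key decoding (the rank of each component gives the tile index); that decoding, the function names, the default parameter values, and the small core graph are all assumptions made for illustration and are not the DEMAP implementation.

```python
import numpy as np


def hops(a, b, width):
    # Equation (1), assuming row-by-row tile numbering.
    return abs(a % width - b % width) + abs(a // width - b // width)


def comm_cost(edges, mapping, width):
    # Equation (2): volume times hop count, summed over all edges.
    return sum(v * hops(mapping[s], mapping[d], width) for s, d, v in edges)


def decode(keys, n_cores):
    # Random-key decoding (assumption): the rank of key i is the tile of core i.
    return np.argsort(np.argsort(keys))[:n_cores]


def de_map(edges, n_cores, width, height, NP=30, F=0.85, CR=0.5,
           max_it=50, seed=0):
    rng = np.random.default_rng(seed)
    D = width * height
    pop = rng.random((NP, D))                      # initialize the individuals
    cost = np.array([comm_cost(edges, decode(x, n_cores), width) for x in pop])
    for _ in range(max_it):
        best = pop[np.argmin(cost)].copy()         # best of current generation
        for i in range(NP):
            others = [r for r in range(NP) if r != i]
            r1, r2 = rng.choice(others, size=2, replace=False)
            mutant = best + F * (pop[r1] - pop[r2])   # (a) DE/best/1
            mask = rng.random(D) < CR                 # (b) binomial crossover
            mask[rng.integers(D)] = True
            test = np.where(mask, mutant, pop[i])
            c = comm_cost(edges, decode(test, n_cores), width)  # (c) evaluate
            if c < cost[i]:                           # (d) selection
                pop[i], cost[i] = test, c
    k = int(np.argmin(cost))
    return decode(pop[k], n_cores), int(cost[k])


# Hypothetical four-core graph: (source core, destination core, volume).
edges = [(0, 1, 70), (1, 2, 362), (2, 3, 362), (0, 3, 16)]
mapping, cost = de_map(edges, n_cores=4, width=2, height=2)
print(mapping, cost)
```

Random keys keep the DE operators unchanged, since every real-valued vector decodes to a valid placement with one core per tile; other discrete encodings are possible and may behave differently.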


5. EXPERIMENTAL RESULTS


To assess the performance of the NoC, a group of benchmarks, as described in [14], is employed.
A benchmark comprises a series of tasks that communicate with one another. Figure 4 (a) to (c) shows
three benchmarks commonly used in testing: (a) the Video Object Plane Decoder (VOPD) benchmark,
(b) the Moving Picture Experts Group (MPEG4) benchmark, and (c) the Multi-Window Display (MWD) benchmark.


Figure 4. The benchmarks utilized in the study include (a) VOPD benchmark, (b) MPEG4 benchmark, and (c) 
MWD benchmark 


Before presenting the results, it is crucial to specify the parameters that were employed in our study. A
4x4 NoC was utilized, and the parameters are listed in Table 1.


Table 1. DE parameters used in the experiments


Parameter    Value used
CR           0.001 ... 1.0
F            0.01 ... 1.0
Pop size     200; 500; 1,000
Max it       500; 1,000; 2,000


After conducting multiple test attempts, we concluded that each application requires its own set of
parameters to achieve the best results. Therefore, we have adopted the parameters presented in Table 2.


Table 2. Parameters used in the DEMAP algorithm

NoC size    CR        F       Pop size    Maxit
4x4         0.0001    0.85    1,000       2,000


We utilized multiple mutation operations to determine which one would lead us to the optimal
outcome. The results obtained from the five mutation operations are shown in Table 3.


Table 3. Results obtained with each mutation strategy

Application    Rand/1    Rand/2    Current to best/1    Best/1    Best/2
VOPD           4701      4743      4965                 4119      4119
MWD            1280      1216      1248                 1184      1184
MPEG4          3660      3666      3667                 3470      3470


Figure 5 enables a visual assessment of the mapping performance of the five mutation strategies
used by the DEMAP algorithm during the experiment. The outcomes indicate that the best-based strategies
(Best/1 and Best/2) consistently yield the lowest communication cost, surpassing the alternative strategies
employed in DEMAP. To further establish the effectiveness of our approach, we compared the results achieved
using our best approach with some other techniques, as shown in Table 4. Figures 6 to 8 provide a visual
comparison between the communication cost results of DEMAP mapping and the compared approaches, allowing for
an easy analysis of the best results.

Table 4. DEMAP results compared to other approaches

Results    VOPD    MWD     MPEG4
DEMAP      4119    1184    3470
ILP        4119    1120    3567
CastNet    4135    1280    3852

Based on the outcomes in Table 4, we can infer that DEMAP produces competitive results: it outperforms
CastNet on all three benchmarks, matches ILP on VOPD, improves on ILP for MPEG4, and trails ILP only on MWD.
The results are thus comparable to the best outcomes achieved using heuristic
approaches, even though these approaches are typically more effective in optimizing a single objective.


6. CONCLUSION 


The process of mapping applications onto a network-on-chip (NoC) is a complex task that is known to 
be NP-hard. To address this challenge, we introduce a novel approach based on evolutionary strategy, specif- 
ically the differential evolution (DE) algorithm. Our approach, named DEMAP, employs a mono-objective 
strategy that seeks to minimize the communication cost. The experiments were carried out on real core
graphs (VOPD, MWD, and MPEG4), and the results indicate that DEMAP can produce better outcomes when the
algorithm parameters are appropriately tuned. As a part of our future work, we plan to investigate the
effectiveness of collaborative techniques in multi-objective optimization mapping and compare it with other
established methods.